Automatic Ontology-Based Knowledge Extraction from Web Documents

نویسندگان

  • Harith Alani
  • Sanghee Kim
  • David E. Millard
  • Mark J. Weal
  • Wendy Hall
  • Paul H. Lewis
  • Nigel Shadbolt
چکیده

these documents contain. Manual annotation is impractical and unscalable, and automatic annotation tools remain largely undeveloped. Specialized knowledge services therefore require tools that can search and extract specific knowledge directly from unstructured text on the Web, guided by an ontology that details what type of knowledge to harvest. An ontology uses concepts and relations to classify domain knowledge. Other researchers have used ontologies to support knowledge extraction,1,2 but few have explored their full potential in this domain. The Artequakt project links a knowledge extraction tool with an ontology to achieve continuous knowledge support and guide information extraction. The extraction tool searches online documents and extracts knowledge that matches the given classification structure. It provides this knowledge in a machine-readable format that will be automatically maintained in a knowledge base (KB). Knowledge extraction is further enhanced using a lexicon-based term expansion mechanism that provides extended ontology terminology.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

متن کامل

Automatic Extraction of Knowledge from Web Documents

A large amount of digital information available is written as text documents in the form of web pages, reports, papers, emails, etc. Extracting the knowledge of interest from such documents from multiple sources in a timely fashion is therefore crucial. This paper provides an update on the Artequakt system which uses natural language tools to automatically extract knowledge about artists from m...

متن کامل

Linguistic Annotation for the Semantic Web

Establishing the semantic web on a large scale implies the widespread annotation of web documents with ontology-based knowledge markup. For this purpose, tools have been developed that allow for semi-automatic annotation of web documents with ontology-based metadata. However, given that a large number of web documents consist either fully or at least partially of free text, language technology ...

متن کامل

Automatic Topic Ontology Construction Using Semantic Relations from WordNet and Wikipedia

Due to the explosive growth of web technology, a huge amount of information is available as web resources over the Internet. Therefore, in order to access the relevant content from the web resources effectively, considerable attention is paid on the semantic web for efficient knowledge sharing and interoperability. Topic ontology is a hierarchy of a set of topics that are interconnected using s...

متن کامل

Automatic Topic Ontology Construction Using Semantic Relations from WordNet and Wikipedia

Due to the explosive growth of web technology, a huge amount of information is available as web resources over the Internet. Therefore, in order to access the relevant content from the web resources effectively, considerable attention is paid on the semantic web for efficient knowledge sharing and interoperability. Topic ontology is a hierarchy of a set of topics that are interconnected using s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IEEE Intelligent Systems

دوره 18  شماره 

صفحات  -

تاریخ انتشار 2003